Abstract

Under mild assumptions, the classical Farkas lemma approach to
Lagrange multiplier theory is extended to an infinite programming
formulation. The main result generalizes the usual first-order necessity
conditions to address problems in which the domain of the objective
function is Hilbert space and the number of constraints is arbitrary.
The result is used to obtain necessity conditions for a well-known
problem from the statistical literature on probability density estimation.


Key words: Lagrange multiplier theory, Farkas lemma, infinite
programming, mathematical programming.


1  Introduction 


In 1951, Kuhn and Tucker [14] developed a Lagrange multiplier theory for
mathematical programming problems that contain inequality constraints. In
this theory, the domain of the objective function is Euclidean space and the
constraint functionals are indexed by a finite set. Their development,
explicated and popularized by Fiacco and McCormick ([6], Chapter 2), invokes
the classical Farkas lemma (Farkas [5]) to generate a vector (the Lagrange
multipliers) that can be viewed as a “weighting” of the finite set of
constraints. For years we have taught this material, each time pondering the
extent to which this development of Lagrange multiplier theory depends on
the finite dimensionality of Euclidean space and the finiteness of the
constraint set. Somewhat recently, our interest was enhanced when we learned
of an interesting infinite programming problem in the statistics literature for
which we could state a formal generalization of the usual first-order
necessity conditions, with no known theoretical justification for doing so. In
the present study of Lagrange multiplier theory, we have not only succeeded in
generalizing the Farkas lemma approach, but have also acquired new insight
into the essential features of that approach.

Of late, it has become fashionable to refer to the first-order necessity
conditions as the “Karush-Kuhn-Tucker conditions,” rather than the
“Kuhn-Tucker conditions.” This observation motivated us to carefully study the
origins of these conditions, resulting in a fascinating excursion into the history
of nonlinear programming and the classical calculus of variations. Because of
the pedagogical nature of the present work, we believe that it is appropriate
to share what we have learned. In passing, we note that Prekopa [17]
effectively argues that fundamental ideas concerning the optimality conditions for
nonlinear programming can be found in some early papers in mechanics by
Fourier, Cournot, and Farkas and also by Gauss, Ostrogradsky and Hamel.

In the 1930s, there flourished at the University of Chicago a school of
thought in the calculus of variations that was founded by G. A. Bliss.
Researchers associated with this school included L. M. Graves, H. H. Goldstine,
M. R. Hestenes, A. S. Householder, W. Karush, E. J. McShane, W. T. Reid,
F. A. Valentine, and many others. Lagrange multiplier theory for the equality
constrained mathematical programming problem was known, in one form or
another, to most researchers in the classical calculus of variations, who were
fully aware that it could be derived from various theories for more general
problems. Bliss [3] presented an elegant exposition of this theory in the first
section of a 1938 survey of normality and abnormality in the calculus of
variations. The first section of Bliss’s article is entitled, “Abnormality for minima
of functions of a finite number of variables,” and it begins, “The significance
of the notion of abnormality in the calculus of variations can be indicated
by a study of the theory of the simpler problem of finding . . . .” (p. 367).
What is particularly enlightening about this section is that it reveals how the
Chicago school regarded finite-dimensional problems: in a pre-computational
era, the theoretically less challenging case of finite dimensions was
primarily valued as a training ground for developing intuition about more difficult
problems.

Another interest of the Chicago school was the incorporation of inequality
constraints. They often used squared slack variables to extend known theory
and insight from the more standard equality constrained problem to the less
standard inequality constrained problem. Hestenes recalled that this device,
as well as the techniques of successive linear and quadratic programming
(techniques usually attributed to the post-war mathematical programming
community), was a standard tool of the Chicago school.

In his 1937 Ph.D. thesis, Valentine [23] studied the problem of Lagrange
with differential inequality constraints of the form

\[ g(x, y, dy/dx) \ge 0 . \]

He replaced this constraint with the equality constraint

\[ g(x, y, dy/dx) = (dz/dx)^2 , \]

where $z(x)$ is an auxiliary function satisfying a particular initial condition,
and applied the standard theory. Today, many authors refer to the use of
squared slack variables as the “Method of Valentine.” The first author of the
present paper, undoubtedly influenced by the instruction of Hestenes and
Valentine in his graduate education at UCLA, often employs this method
to develop insight in elementary courses. For example, a quick way to
derive Lagrange multiplier theory for general nonlinear programming is to add
squared slack variables to the inequality constraints, then apply the
standard theory for equality constrained nonlinear programming. This approach
does not establish the nonnegativity of the multipliers of the inequality
constraints; however, their nonnegativity follows directly from the second-order
necessity conditions. Collectively, then, one obtains the same first-order and
second-order necessity conditions from this elementary approach. The price
that one pays is that regularity (linear independence of the gradients of the
binding constraints) must be assumed, so that it is not possible to reap the
benefits of more subtle constraint qualifications.
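
The squared slack device can be illustrated on the simplest case. The following is only a sketch for a minimization problem with a single inequality constraint; the symbols $s$ (slack), $u$ (multiplier), and the sign convention of $L$ are assumptions made for the illustration and are not taken from the works cited above.

```latex
\begin{align*}
&\text{minimize } f(x) \text{ subject to } g(x) \ge 0
 \quad\Longrightarrow\quad
 \text{minimize } f(x) \text{ subject to } g(x) - s^2 = 0, \\
&L(x, s, u) = f(x) - u\,\bigl(g(x) - s^2\bigr), \\
&\nabla_x L = \nabla f(x) - u\,\nabla g(x) = 0, \qquad
 \partial L/\partial s = 2 u s = 0 , \\
&\text{binding constraint } (s = 0):\quad
 \partial^2 L/\partial s^2 = 2u \ge 0 .
\end{align*}
```

The first-order conditions give stationarity and the complementarity relation $us = 0$, but no sign for $u$. When the constraint is binding, the gradient of the equality constraint is $(\nabla g(x), -2s) = (\nabla g(x), 0)$, so the pure slack direction is tangent to the constraint; positive semidefiniteness of the Hessian of $L$ along that direction then yields $u \ge 0$.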

Given these interests of the Chicago school, it made perfect sense for
Graves to assign to Karush, as a topic suitable for his master’s thesis, the
simple problem of extending Bliss’s finite-dimensional treatment of
equality constraints to the case of inequality constraints. Karush [12] handled
his assignment beautifully, deriving first-order necessity conditions using the
Farkas lemma and formally stating the Kuhn-Tucker constraint
qualification as “Property Q.” Thus, Karush’s 1939 thesis contains the Kuhn-Tucker
theory in all of its particulars. The constraint qualification Q was implicit
in Bliss [3], who showed that it was implied by regularity, which he then
assumed. Karush, however, assumed only the constraint qualification,
according it a privileged status that Bliss had not.

One characteristic of the usual (Kuhn-Tucker) approach to Lagrange
multiplier theory is that it requires the multiplier of the gradient of the objective
function (say $\lambda_0$) to be unity. This requirement is what necessitates a
constraint qualification; it is generally known that this hypothesis is unnecessary
if $\lambda_0$ is allowed to vary freely. For nonlinear programming with inequality
constraints, this fact is usually attributed to John [11] and called the Fritz
John condition. (The device itself was first studied by Mayer [15]; in fact,
this is the standard formulation of Lagrange multiplier theory in the calculus
of variations.) It is of value, here, to point out that Karush also presented a
form of the John theory. A straightforward application of Bliss’s treatment
of the equality constrained problem to the squared slack variable formulation
of the inequality constrained problem led him to a result that is identical to
John’s, but without the sign restrictions on the multipliers. Karush then
observed that the sign restrictions followed from Bliss’s second-order necessity
theory. As we have already observed, however, the price of this derivation
is the assumption of regularity and two continuous derivatives. Accordingly,
Karush employed this approach in a manner that we have attributed to the
Chicago school, viz. to obtain insights that led him to the Karush-Kuhn-Tucker
theory.

Karush was never encouraged to publish his work, presumably because
the finite-dimensional case was then deemed too elementary to be of
independent interest. It remained virtually unknown until Kuhn discovered a
reference to it in a 1974 textbook by Takayama [19] and obtained a copy
of the thesis. Kuhn attempted to “set the record straight” at a 1975 AMS
symposium (Kuhn [13]), and even went so far as to refer to the “Karush
conditions.” The same year, Hestenes [10], who had directed Karush’s Ph.D.
thesis, noted Karush’s work in his book on optimization theory. Even so, it
is only very recently that Karush’s work has been generally acknowledged.

In summary, our historical investigations have led us to strongly
support the “Karush-Kuhn-Tucker” terminology. There is no question of the
importance of the Kuhn-Tucker paper in the history of mathematical
programming, but there is also no question that Karush obtained the identical
result twelve years earlier. Furthermore, not only is Karush himself deserving
of recognition, but we believe that the use of his name is a fitting tribute to
the members of the great Chicago school of Bliss, whose deep understanding
of mathematical programming has not been properly recognized and
appreciated. Because it was not until the 1950s that there was a demand for the
finite-dimensional theory, the Chicago researchers were simply two decades
ahead of their time. The Karush-Kuhn-Tucker conditions represent a rare
instance in which it is possible to document just how much of mathematical
programming these researchers understood and anticipated.

Returning to the present paper, our objective is to extend the classical
Farkas lemma approach to mathematical programming problems in which
the domain of the objective function is Hilbert space and the constraint
functionals are indexed by an arbitrary set. Our approach carefully mimics
the finite programming development. It is based on a generalized Farkas
lemma, and replaces the Lagrange multiplier vector with a measure on the
(possibly infinite) index set. If this measure is absolutely continuous, then
it can be represented as a (density) function on the index set. Because our
point of view may seem unnatural to some readers otherwise familiar with
Lagrange multiplier theory, we briefly digress to motivate it.

Consider vectors $x_1, \ldots, x_k \in R^n$, scalar weights $u_1, \ldots, u_k \in R$, and the
weighted sum

\[ \sum_{i \in I} u_i x_i , \]

where the index set $I = \{1, \ldots, k\}$. By defining a measure $\mu$ on the Borel
sets of $R^n$ that concentrates on $\{x_1, \ldots, x_k\}$ and satisfies $\mu(\{x_i\}) = u_i$, we
can write

\[ \sum_{i \in I} u_i x_i = \int x \, \mu(dx) . \]

Thus, a set of weights can be viewed as a measure and a weighted sum can
be viewed as a (Lebesgue) integral with respect to that measure. When the
weights are nonnegative and sum to unity, $\mu$ is a probability measure and
probabilists call the integral an expectation.

Now consider the index map $i \mapsto x_i$, which embeds $I$ in $R^n$. The measure
$\mu$ induces a measure $u$ on the subsets of $I$ by $u(\{i\}) = u_i$. This allows
us to further write

\[ \sum_{i \in I} u_i x_i = \int x \, \mu(dx) = \int_I x_i \, u(di) ; \]

hence, our point of view that a set of weights is a measure on an index set.
It is this perspective that will lead to a manageable statement of generalized
first-order conditions.
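
The identification of weights with a measure can be checked numerically in the finite case. The sketch below is illustrative only: the function names and the dictionary representation of a discrete measure are assumptions, not notation from the paper. It starts from a measure on the index set and pushes it forward along the index map, then confirms that the weighted sum and the two integrals agree.

```python
from fractions import Fraction


def weighted_sum(vectors, weights):
    """Plain weighted sum  sum_i u_i * x_i  over the index set I = {0, ..., k-1}."""
    dim = len(vectors[0])
    return tuple(sum(u * x[j] for u, x in zip(weights, vectors)) for j in range(dim))


def expectation(measure):
    """Integral of x against a discrete measure given as {point: mass}."""
    points = list(measure)
    dim = len(points[0])
    return tuple(sum(measure[p] * p[j] for p in points) for j in range(dim))


def pushforward(vectors, index_measure):
    """Measure on the points x_i induced by a measure on the index set via i -> x_i."""
    mu = {}
    for i, mass in index_measure.items():
        mu[vectors[i]] = mu.get(vectors[i], 0) + mass
    return mu


def index_integral(vectors, index_measure):
    """Integral over the index set:  sum_i x_i * u({i})."""
    dim = len(vectors[0])
    return tuple(sum(m * vectors[i][j] for i, m in index_measure.items()) for j in range(dim))


# Hypothetical data: three vectors in R^2 with weights that sum to one.
vectors = [(1, 0), (0, 1), (1, 1)]
weights = [Fraction(1, 2), Fraction(1, 4), Fraction(1, 4)]
u = dict(enumerate(weights))      # measure on the index set I
mu = pushforward(vectors, u)      # measure on R^2 concentrated on the x_i
```

Because the weights here are nonnegative and sum to one, `expectation(mu)` is an expectation in the probabilist's sense.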

The flavor of our generalization of Lagrange multiplier theory is not
entirely new. Semi-infinite programming is also concerned with problems in
which the constraint functionals are indexed by an infinite set, although the
domain of the objective function is still assumed to be Euclidean space. The
famous paper by John [11] posed a semi-infinite programming problem;
however, John exploited the finite dimensionality of $R^n$ to reduce the number
of constraints to $n + 1$. More recently, a multiplier theorem of precisely the
sort that we seek was obtained by Goberna, López, and Pastor [8]. The
authors use a generalized Farkas lemma and retain the full set of constraints;
however, their result also depends critically on the finite dimensionality of
Euclidean space.

It should be noted that a number of authors have published multiplier
theorems in very abstract settings. The standard formulation is that of
Guignard [9], who derived both necessity and sufficiency conditions for the
problem

    maximize    $\psi(x)$
    subject to  $x \in C \subset X$
                $a(x) \in B \subset Y$ ,

where $X$ and $Y$ are real Banach spaces and $\psi : X \to (-\infty, +\infty)$ and
$a : X \to Y$ are Fréchet differentiable. Guignard’s multiplier is an element of the
topological dual space of $Y$, and her entire approach is markedly different
from ours.

The primary purpose of the present paper is pedagogical. That is, we wish
to demonstrate that by (i) interpreting the vector of Lagrange multipliers as
a measure on the index set of constraints and by (ii) utilizing tools from
functional analysis and probability theory, the standard finite-dimensional
approach to multiplier theory (Karush-Kuhn-Tucker) can be successfully
generalized to infinite programming in Hilbert space. This exercise, however, is
not entirely pedagogical, for we also believe that there are important
infinite programming problems to which our theory can be profitably applied.
Therefore, after deriving first-order necessity conditions for general infinite
programming problems in Section 2, in Section 3 we will consider results that
facilitate the use of these conditions. In Section 4, by way of an example, we
will also apply this theory to obtain necessity conditions for a constrained
optimization problem from the statistical literature on probability density
estimation. However, we have deferred to another paper (Trosset [21]) an
investigation of the statistical consequences of these conditions.


2  Main  Theorem 

We begin with a real Hilbert space $X$ with inner product $\langle \cdot , \cdot \rangle$. By the
general nonlinear programming problem, or problem (NLP) for short, we
mean the constrained optimization problem

    minimize    $f(x)$
    subject to  $g_\alpha(x) \ge 0 \quad \forall \alpha \in I$
                $h_\beta(x) = 0 \quad \forall \beta \in J$ ,

where $f, g_\alpha, h_\beta : X \to (-\infty, +\infty)$. We assume that the index sets $I$ and $J$
have corresponding sigma fields $\mathcal{I}$ and $\mathcal{J}$ such that the pairs $(I, \mathcal{I})$ and $(J, \mathcal{J})$
are measure spaces. Measures on these spaces will be denoted by $t$, $u$, $\lambda$, etc.
At times, we will also endow $I$ and $J$ with topologies. Typically, $I$ and $J$ will
be subsets of Euclidean space. For each $x \in X$, we define the index subset
$I_0(x) := \{\alpha \in I : g_\alpha(x) = 0\}$.

We assume that $f, g_\alpha, h_\beta \in C^1$. For each $x \in X$, the sets
$\nabla A_0(x) := \{\nabla g_\alpha(x) : \alpha \in I_0(x)\}$ and $\nabla B(x) := \{\nabla h_\beta(x) : \beta \in J\}$
are assumed to be Borel measurable. We also assume that the index maps
$\alpha \mapsto \nabla g_\alpha(x)$ and $\beta \mapsto \nabla h_\beta(x)$ are Borel bimeasurable functions. This will
enable measures on $I$ and $J$ to induce measures on $X$, and also conversely.
Measures on $X$ will be denoted $\nabla F$, $\nabla G$, etc.

For technical reasons, we will sometimes further assume that the functions
$g_\alpha$ and $h_\beta$ are elements of a real Hilbert space $\Gamma$. In that event, we will assume
that the sets $A_0(x) := \{g_\alpha : \alpha \in I_0(x)\}$ and $B := \{h_\beta : \beta \in J\}$ are Borel
measurable. We will also assume that the index maps $\alpha \mapsto g_\alpha$ and $\beta \mapsto h_\beta$
are Borel measurable functions. This will enable measures on $I$ and $J$ to
induce measures on $\Gamma$. Such measures will be denoted by $F$, $G$, etc.

In this section we will derive necessary conditions for a point $x^* \in X$ to
be a local solution of problem (NLP). To do so, we modify and generalize
Fiacco’s and McCormick’s [6] presentation of the first-order theory for the
finite-dimensional case. The key to this generalization is the concept of the
expectation of a measure on a Hilbert space. Toward this end, in what
follows $H$ will denote a real Hilbert space with inner product $\langle \cdot , \cdot \rangle$. Following
Parthasarathy ([16], Definition 6.2, p. 168) we make the following definition.

Definition 2.1 Let $\mu$ be a measure on $H$. If the linear functional
$L(y) := \int \langle y, x \rangle \mu(dx)$ is continuous, then the expectation of $\mu$, which we denote by
$\int x \, \mu(dx)$, is defined to be the Riesz representer of $L$.

At this point it will be of value to introduce some basic notation. Let
$M(K)$ denote the family of totally finite positive measures that concentrate
on the set $K \subset H$, and let $M_1(K)$ denote the family of probability
measures that concentrate on the set $K \subset H$. We are interested in the sets of
expectations

\[ C(K) = \left\{ \int x \, \mu(dx) : \mu \in M(K) \right\} \]

and

\[ C_1(K) = \left\{ \int x \, \mu(dx) : \mu \in M_1(K) \right\} . \]

The set $C_1(K)$ is essentially the convex hull of $K$, and the set $C(K)$ is
essentially the cone generated by $C_1(K)$. It should be clear that $C(K)$ and
$C_1(K)$ are convex. In the next section we will demonstrate that $C_1(K)$ is also
compact. The closedness of $C(K)$ will be of fundamental importance in our
theory. In the next section we will construct a condition which guarantees
that $C(K)$ is closed. However, for the moment we will assume that it is
closed.

We  now  generalize  a  famous  result. 

Lemma 2.1 (Generalized Farkas Lemma): Let $H$ denote a real Hilbert space
with inner product $\langle \cdot , \cdot \rangle$. Let $x_0 \in H$ and $K \subset H$. Assume that $C(K)$ is
closed. Then the following are equivalent:

(i) $\forall y \in H$, $\langle y, x \rangle \ge 0 \ \forall x \in K$ entails $\langle y, x_0 \rangle \ge 0$ ;

(ii) $\exists \mu \in M(K)$ such that $x_0 = \int x \, \mu(dx)$ .

Proof: We utilize the notion of a dual cone, introduced by Dieudonné [4]
in his proof of the Hahn-Banach theorem. The dual cone $C^*$ of a cone $C$ is
the set of all continuous linear functionals nonnegative on $C$.

Consider (i). If $\langle y, x \rangle \ge 0 \ \forall x \in K$, then
$\langle y, \int x \, \mu(dx) \rangle = \int \langle y, x \rangle \mu(dx) \ge 0 \ \forall \mu \in M(K)$,
i.e. $y \in C(K)^*$. Hence, (i) is equivalent to the assertion that
$y \in C(K)^*$ entails $\langle y, x_0 \rangle \ge 0 \ \forall y \in H$, or simply that $x_0 \in C(K)^{**}$. By
Lemma 5.6 in Girsanov [7], $C(K)^{**}$ is the weak closure of the convex hull of
$C(K)$. Since $C(K)$ is closed and convex, it follows that (i) is equivalent to
$x_0 \in C(K)^{**} = C(K)$. But (ii) is a direct statement that $x_0 \in C(K)$; hence,
(i) and (ii) are equivalent. □
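
To make the two alternatives concrete, the following sketch treats the simplest finite case: $K = \{a, b\}$ consists of two linearly independent vectors in the plane with integer coordinates. These restrictions, and the names `dot` and `farkas_alternative`, are assumptions made for the illustration. Either nonnegative multipliers exist, which is statement (ii), or a vector $y$ is returned that is nonnegative on $K$ but negative on $x_0$, which violates statement (i).

```python
from fractions import Fraction


def dot(y, x):
    """Euclidean inner product of two vectors of equal length."""
    return sum(yi * xi for yi, xi in zip(y, x))


def farkas_alternative(a, b, x0):
    """Decide whether x0 lies in the cone generated by independent a, b in Z^2.

    Returns ("multipliers", (s, t)) with s, t >= 0 and x0 = s*a + t*b,
    or ("separator", y) with <y, a> >= 0, <y, b> >= 0 and <y, x0> < 0.
    """
    det = a[0] * b[1] - a[1] * b[0]
    if det == 0:
        raise ValueError("a and b must be linearly independent")
    # Cramer's rule for the unique solution of x0 = s*a + t*b.
    s = Fraction(x0[0] * b[1] - x0[1] * b[0], det)
    t = Fraction(a[0] * x0[1] - a[1] * x0[0], det)
    if s >= 0 and t >= 0:
        return "multipliers", (s, t)
    # Dual basis: <y_a, a> = 1, <y_a, b> = 0 and <y_b, a> = 0, <y_b, b> = 1,
    # so <y_a, x0> = s and <y_b, x0> = t.
    y_a = (Fraction(b[1], det), Fraction(-b[0], det))
    y_b = (Fraction(-a[1], det), Fraction(a[0], det))
    return "separator", (y_a if s < 0 else y_b)
```

For example, `farkas_alternative((1, 0), (0, 1), (2, 3))` reports multipliers, while `farkas_alternative((1, 0), (0, 1), (-1, 1))` reports a separating vector. A finitely generated cone is always closed, so the closedness hypothesis of the lemma is automatic in this setting.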

Remark: It is possible to give an elementary, but more complicated proof of
this result. The very elegant proof that we have presented was suggested to
us anonymously by the referee. This is an amusing realization of Valentine’s
[22] admonition to “always look at the dual situation when working with
convex sets for it may save you some embarrassment.”

Associated with problem (NLP) is the generalized Lagrangian gradient

\[ \ell(x, u, \lambda) := \nabla f(x) - \int_I \nabla g_\alpha(x) \, u(d\alpha) + \int_J \nabla h_\beta(x) \, \lambda(d\beta) , \]

which is guaranteed to exist if the sets $\nabla A(x)$ and $\nabla B(x)$ are compact and
the measures $u$ and $\lambda$ are totally finite. Our goal is to derive necessary
conditions for solving problem (NLP) that involve this expression. We are
now in a position to characterize some of these conditions.

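Suppose that $x$ is a feasible point of problem (NLP). Let

\[ Z_1(x) := \{ z \in X : \langle z, \nabla g_\alpha(x) \rangle \ge 0 \ \forall \alpha \in I_0(x), \
   \langle z, \nabla h_\beta(x) \rangle = 0 \ \forall \beta \in J, \
   \langle z, \nabla f(x) \rangle \ge 0 \} , \]

and

\[ Z_2(x) := \{ z \in X : \langle z, \nabla g_\alpha(x) \rangle \ge 0 \ \forall \alpha \in I_0(x), \
   \langle z, \nabla h_\beta(x) \rangle = 0 \ \forall \beta \in J, \
   \langle z, \nabla f(x) \rangle < 0 \} . \]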

Proposition 2.1 Let $K = \nabla A_0(x^*) \cup \nabla B(x^*)$. Assume

A1: $K$ is compact;

A2: $C(K)$ is closed.

If $x^*$ is a feasible point of problem (NLP), then the following are equivalent:

(i) $Z_2(x^*) = \emptyset$;

(ii) There exist totally finite measures $u^*$ on $(I, \mathcal{I})$ and $\lambda^*$ on $(J, \mathcal{J})$ such
that

(a) $\ell(x^*, u^*, \lambda^*) = 0$,

(b) $g_\alpha(x^*) \ge 0 \ \forall \alpha \in I$,

(c) $h_\beta(x^*) = 0 \ \forall \beta \in J$,

(d) $u^*(I') = 0 \ \forall I'$ measurable $\subset I \sim I_0(x^*)$,

(e) $u^*(I') \ge 0 \ \forall I'$ measurable $\subset I$.

We will refer to the conditions (a)-(e) in (ii) as the generalized first-order
conditions.
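
When the index sets are finite, every measure is determined by its point masses and conditions (a)-(e) reduce to the familiar Karush-Kuhn-Tucker system. The sketch below checks them for such a case; the function names, the dictionary representation of the measures, the tolerance, and the test problem are all assumptions introduced for the illustration.

```python
def lagrangian_gradient(grad_f, grad_g, u, grad_h, lam):
    """grad f - sum_a grad g_a * u({a}) + sum_b grad h_b * lam({b}) for finite index sets."""
    out = list(grad_f)
    for a, mass in u.items():
        for j, c in enumerate(grad_g[a]):
            out[j] -= mass * c
    for b, mass in lam.items():
        for j, c in enumerate(grad_h[b]):
            out[j] += mass * c
    return tuple(out)


def first_order_conditions_hold(g_vals, h_vals, grad_f, grad_g, grad_h, u, lam, tol=1e-12):
    """Check conditions (a)-(e) at a point, given constraint values and gradients there."""
    a = all(abs(c) <= tol for c in lagrangian_gradient(grad_f, grad_g, u, grad_h, lam))
    b = all(v >= -tol for v in g_vals.values())            # inequality feasibility
    c = all(abs(v) <= tol for v in h_vals.values())        # equality feasibility
    d = all(abs(u.get(i, 0.0)) <= tol                      # no mass off the active set
            for i, v in g_vals.items() if v > tol)
    e = all(m >= -tol for m in u.values())                 # u is a positive measure
    return a and b and c and d and e


# Hypothetical problem: minimize x1^2 + x2^2
# subject to g_1(x) = x1 + x2 - 1 >= 0 and g_2(x) = x1 + 1 >= 0, at x* = (0.5, 0.5).
grad_f = (1.0, 1.0)                       # gradient of the objective at x*
grad_g = {1: (1.0, 1.0), 2: (1.0, 0.0)}   # constraint gradients at x*
g_vals = {1: 0.0, 2: 1.5}                 # g_1 is active, g_2 is inactive
u_star = {1: 1.0, 2: 0.0}                 # candidate multiplier measure on I = {1, 2}
```

With these data, `first_order_conditions_hold(g_vals, {}, grad_f, grad_g, {}, u_star, {})` evaluates the five conditions in turn; placing positive mass on the inactive index 2 violates both (a) and (d).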

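Proof: Assume (ii) and suppose that $z \in Z_2(x^*)$. Then

\begin{align*}
0 &> \langle z, \nabla f(x^*) \rangle \\
  &= \Big\langle z, \int_I \nabla g_\alpha(x^*) \, u^*(d\alpha) - \int_J \nabla h_\beta(x^*) \, \lambda^*(d\beta) \Big\rangle \\
  &= \int_I \langle z, \nabla g_\alpha(x^*) \rangle \, u^*(d\alpha) - \int_J \langle z, \nabla h_\beta(x^*) \rangle \, \lambda^*(d\beta) \\
  &\ge 0 ,
\end{align*}

which is a contradiction. This proves that (ii) implies (i).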

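Conversely, suppose that $Z_2(x^*) = \emptyset$. Then, if $z$ satisfies

\begin{align*}
\langle z, \nabla g_\alpha(x^*) \rangle &\ge 0 \quad \forall \alpha \in I_0(x^*) \\
\langle z, \nabla h_\beta(x^*) \rangle &\ge 0 \quad \forall \beta \in J \\
\langle z, -\nabla h_\beta(x^*) \rangle &\ge 0 \quad \forall \beta \in J ,
\end{align*}

$z$ must also satisfy $\langle z, \nabla f(x^*) \rangle \ge 0$. But this implication is (i) in Lemma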
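2.1, so we may conclude that

\[ \nabla f(x^*) = \int_{I_0(x^*)} \nabla g_\alpha(x^*) \, t_0(d\alpha) + \int_J \nabla h_\beta(x^*) \, t'(d\beta) - \int_J \nabla h_\beta(x^*) \, t''(d\beta) . \]

We now obtain conditions (a)-(e) by setting $u^* = t_0$ on $I_0(x^*)$, $u^* = 0$ on
$I \sim I_0(x^*)$, and $\lambda^* = -(t' - t'')$ on $J$. □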

Our statement of first-order condition (a) is somewhat nontraditional.
Suppose that the $g_\alpha$ and $h_\beta$ are elements of a real Hilbert space $\Gamma$. Assuming
that the indicated expectations exist (which, of course, they may not), define
the generalized Lagrangian function to be

\[ \mathcal{L}(x, u, \lambda) := f(x) - \int_I g_\alpha(x) \, u(d\alpha) + \int_J h_\beta(x) \, \lambda(d\beta) . \]

To conform to common practice, we would write condition (a) as

\[ \nabla_x \mathcal{L}(x^*, u^*, \lambda^*) = \ell(x^*, u^*, \lambda^*) = 0 . \]

The following result establishes circumstances in which this representation is
legitimate.

Proposition 2.2 Fix $x \in X$. Let $u$ and $\lambda$ denote totally finite measures
on $(I, \mathcal{I})$ and $(J, \mathcal{J})$. Assume that the expectations $g := \int_I g_\alpha \, u(d\alpha)$ and
$h := \int_J h_\beta \, \lambda(d\beta)$ both exist. If the sets of functions $A := \{g_\alpha : \alpha \in I\}$ and $B$ are
each uniformly Lipschitz continuous, then $\nabla_x \mathcal{L}(x, u, \lambda) = \ell(x, u, \lambda)$.

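Proof: We must establish that

\[ \nabla \int_I g_\alpha(x) \, u(d\alpha) = \nabla g(x) = \int_{\nabla A(x)} y \, \nabla F(dy) = \int_I \nabla g_\alpha(x) \, u(d\alpha) , \qquad (1) \]

where $\nabla F$ is the measure on $\nabla A(x)$ induced by $u$. Clearly it suffices to prove
that

\[ \langle \eta, \nabla g(x) \rangle = \Big\langle \eta, \int_{\nabla A(x)} y \, \nabla F(dy) \Big\rangle \quad \forall \eta \in X . \qquad (2) \]

We note that

\begin{align*}
\langle \eta, \nabla g(x) \rangle &= g'(x)(\eta) \\
  &= \lim_{\epsilon \to 0} \frac{1}{\epsilon} \{ g(x + \epsilon\eta) - g(x) \} \\
  &= \lim_{\epsilon \to 0} \Big\{ \int_I \frac{1}{\epsilon} [ g_\alpha(x + \epsilon\eta) - g_\alpha(x) ] \, u(d\alpha) \Big\} \\
  &= \lim_{\epsilon \to 0} \int_I \phi_\epsilon(\alpha) \, u(d\alpha) ,
\end{align*}

where $\phi_\epsilon(\alpha) := \frac{1}{\epsilon} [ g_\alpha(x + \epsilon\eta) - g_\alpha(x) ]$, and that

\begin{align*}
\Big\langle \eta, \int_{\nabla A(x)} y \, \nabla F(dy) \Big\rangle &= \int_{\nabla A(x)} \langle \eta, y \rangle \, \nabla F(dy) \\
  &= \int_I \langle \eta, \nabla g_\alpha(x) \rangle \, u(d\alpha) \\
  &= \int_I \lim_{\epsilon \to 0} \frac{1}{\epsilon} [ g_\alpha(x + \epsilon\eta) - g_\alpha(x) ] \, u(d\alpha) \\
  &= \int_I \lim_{\epsilon \to 0} \phi_\epsilon(\alpha) \, u(d\alpha) .
\end{align*}

Since the $g_\alpha$’s are uniformly Lipschitz continuous, i.e. $\exists M < \infty$ such that
$|g_\alpha(y) - g_\alpha(x)| \le M \|y - x\| \ \forall x, y \in X$ and $\forall \alpha \in I$, we have

\[ |\phi_\epsilon(\alpha)| \le \frac{M}{|\epsilon|} \|\epsilon\eta\| = M \|\eta\| < \infty . \]

Then, since $u$ is totally finite, we can apply the Dominated Convergence
Theorem to interchange $\lim_{\epsilon \to 0}$ and $\int_I$. This establishes (2); hence, (1). The
identical argument establishes that

\[ \nabla \int_J h_\beta(x) \, \lambda(d\beta) = \int_J \nabla h_\beta(x) \, \lambda(d\beta) , \]

and the result follows. □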

We now return to problem (NLP). As in the finite-dimensional case, in
order to derive a necessity condition from Proposition 2.1, we must supplement
the first-order conditions with a constraint qualification.


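Definition 2.2 Suppose that $x^*$ is a feasible point of problem (NLP). We
say that $x^*$ satisfies the constraint qualification for problem (NLP) if:

for each nonzero $z \in X$ satisfying $\langle z, \nabla g_\alpha(x^*) \rangle \ge 0 \ \forall \alpha \in I_0(x^*)$ and
$\langle z, \nabla h_\beta(x^*) \rangle = 0 \ \forall \beta \in J$, there exists $\tau > 0$ and a continuous arc
$\theta : [0, \tau) \to X$ satisfying

    $\theta(0) = x^*$ ,
    $\theta'(0) = z$ ,
    $g_\alpha(\theta(t)) \ge 0 \quad \forall t \in [0, \tau)$ and $\forall \alpha \in I$ ,
    $h_\beta(\theta(t)) = 0 \quad \forall t \in [0, \tau)$ and $\forall \beta \in J$ .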

Our  main  result  now  follows  precisely  as  in  the  finite-dimensional  case. 

Theorem 2.1 Let $K = \nabla A_0(x^*) \cup \nabla B(x^*)$. Assume

A1: $K$ is compact;

A2: $C(K)$ is closed.

If $x^*$ satisfies the constraint qualification for problem (NLP), then a
necessary condition for $x^*$ to be a local solution of problem (NLP) is that the
first-order conditions hold.


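Proof: We invoke Proposition 2.1. Suppose that $x^*$ is a local solution and
that $z \in Z_2(x^*)$. Clearly it must be that $z \ne 0$, so there exists a feasible
continuous arc $\theta : [0, \tau) \to X$. Since $x^*$ is a local solution, for $t > 0$
sufficiently small it must be that

\[ f(\theta(t)) \ge f(x^*) = f(\theta(0)) \]

and therefore

\[ \frac{f(\theta(t)) - f(\theta(0))}{t} \ge 0 . \]

But this implies that

\[ (f \circ \theta)'(0) = \langle \nabla f(\theta(0)), \theta'(0) \rangle = \langle \nabla f(x^*), z \rangle \ge 0 , \]

which is a contradiction. □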

3  Discussion  of  Hypotheses

Let us now examine assumptions A1 and A2 of Theorem 2.1. To begin, recall
that they automatically hold in the case of finite programming. Specifically,
a finite set is compact, and it is well-known that a finitely generated cone
is closed. Hence, assumptions A1 and A2 are exactly the price one must
pay to extend the Farkas lemma approach to necessity conditions from finite
programming to infinite programming. Of course this extension will be of
no value if we cannot find reasonable conditions that imply assumptions A1,
A2, and the constraint qualification. Such conditions are the subject of the
present section; in the next section, we will present a meaningful example
that satisfies our conditions.

We first consider assumption A1, the compactness assumption for the sets
$\nabla A_0(x^*)$ and $\nabla B(x^*)$. Since problem (NLP) is stated without reference to
these sets, it is obviously cumbersome to check hypotheses involving them.
Fortunately, many problems will not require this.

Lemma 3.1 Suppose that the $g_\alpha$ and $h_\beta$ are elements of a real Hilbert space
$\Gamma$. Fix $x \in X$ and let $V_x$ denote evaluation at $x$. Suppose that the $g_\alpha$, $h_\beta$
are uniformly continuous and that $V_x$ is a continuous functional on $A$ and
$B$. Assume that the index sets $I$ and $J$ have been topologized. If $I$ and $J$ are
compact and the index maps $\alpha \mapsto g_\alpha$ and $\beta \mapsto h_\beta$ are continuous, then the
sets $\nabla A_0(x)$ and $\nabla B(x)$ are compact.

Proof: We argue in terms of the $g_\alpha$. Given a sequence $\{\alpha_n\} \subset I_0(x)$,
we claim that there exists $\alpha_0 \in I_0(x)$ and a subsequence $\{\alpha_{n'}\}$ such that

\[ \nabla g_{\alpha_{n'}}(x) \to \nabla g_{\alpha_0}(x) . \]

The indexing assumptions imply that $A$ is compact. Since $V_x$ is
continuous on $A$, it follows that the level set $A_0(x) = \{g_\alpha : g_\alpha(x) = 0\} =
\{g_\alpha : V_x(g_\alpha) = 0\}$ is closed, hence compact itself. Therefore there exists
$\alpha_0 \in I_0(x)$ and a subsequence $\{\alpha_{n'}\}$ such that

\[ g_{\alpha_{n'}} \to g_{\alpha_0} . \]

The convergence indicated is in norm. However, since the $g_\alpha$ are uniformly
continuous, the convergence must also be uniform. But this allows us to write

\[ \nabla g_{\alpha_{n'}}(x) \to \nabla g_{\alpha_0}(x) , \]

as claimed. □

We now derive conditions which imply that assumption A2 holds. We first
derive a technical lemma about expectations that will be used to show that
$K$ compact and $0 \notin C_1(K)$ implies A2. This lemma derives from probability
theory. An excellent reference for the requisite material is Billingsley [2].

Lemma 3.2 Let $H$ denote a real Hilbert space with inner product $\langle \cdot , \cdot \rangle$. Let
$M_1(K)$ denote the family of probability measures that concentrate on the set
$K \subset H$. If $K$ is compact, then the set of expectations
$C_1(K) := \{ \int x \, \mu(dx) : \mu \in M_1(K) \}$ is convex and compact.

Remark: As mentioned before, the set $C_1(K)$ is essentially the convex hull
of $K$.

Proof: Since $K$ is compact,
$\int \langle y, x \rangle \mu(dx) \le \|y\| \int \|x\| \mu(dx) \le \|y\| \sup_{x \in K} \|x\| < \infty$;
we are therefore assured that the expectations exist. The convexity of
$C_1(K)$ follows immediately from the linearity of expectation.

To demonstrate compactness, consider the sequence
$\{ x_n = \int x \, \mu_n(dx) : \mu_n \in M_1(K) \}$. Since $K$ is compact, $M_1(K)$ is tight. It
follows from Prohorov’s Theorem that there exists a weakly convergent
subsequence of $\{\mu_n\}$, i.e. that there exists $\mu_0 \in M_1(K)$ and a subsequence $\{\mu_{n'}\}$
such that

\[ \int \phi(x) \, \mu_{n'}(dx) \to \int \phi(x) \, \mu_0(dx) \]

for all bounded continuous functions $\phi : H \to (-\infty, +\infty)$. Since $K$ is
compact, $\langle y, \cdot \rangle$ is such a function; hence

\[ \int \langle y, x \rangle \mu_{n'}(dx) \to \int \langle y, x \rangle \mu_0(dx) \quad \forall y \in H . \]

Then it must be that the Riesz representers

\[ \int x \, \mu_{n'}(dx) \to \int x \, \mu_0(dx) , \]

so the arbitrary sequence $\{x_n\}$ has a convergent subsequence. □

We now remove the restriction that the positive measures used to form
expectations have a total mass of unity.

Lemma 3.3 Let $H$ denote a real Hilbert space with inner product $\langle \cdot , \cdot \rangle$ and
origin 0. Let $M(K)$ denote the family of totally finite, positive measures
that concentrate on the set $K \subset H$. If $K$ is compact and $0 \notin C_1(K)$, then
$C(K) := \{ \int x \, \mu(dx) : \mu \in M(K) \}$ is convex and closed.




Remark: As mentioned before, the set $C(K)$ is essentially the half-cone
generated by the convex hull of $K$.

Remark: The conditions that $K$ is compact and $0 \notin C_1(K)$ are sufficient
but not necessary for the conclusion. To illustrate, let $S$ be a closed
subspace of $H$ and let $K \subset S$ be any set such that $0$ is an interior point
of $C_1(K) \subset S$ relative to $S$, e.g. $\{x \in S : \|x\| \le 1\}$. Then $C(K) = S$
is automatically convex and closed. However, the simple conditions
stated in the lemma have a natural analog in the finite-dimensional
theory and are entirely adequate for the example of Section 4.


Proof: Writing $C(K) = \{rx : x \in C_1(K),\ r \in [0,+\infty)\}$, it follows from the
convexity of $C_1(K)$ that $C(K)$ is a convex half-cone. We claim that $C(K)$ is
also closed.

Toward that end, suppose that $\{y_n\} \subset C(K)$ with $\|y_n - y\| \to 0$. Write
$y_n = r_n x_n$ with $x_n \in C_1(K)$. By the compactness of $C_1(K)$, $\{x_n\}$ contains a
subsequence with $\|x_{n'} - x\| \to 0$ for some $x \in C_1(K)$. Moreover, since
$0 \notin C_1(K)$, $\|x\| > 0$.

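Now let $\epsilon > 0$ be arbitrary. By construction, there exists $N(\epsilon)$ such that
$n' \ge N(\epsilon)$ entails $\|x_{n'} - x\| < \epsilon$ and $\|r_{n'} x_{n'} - y\| < \epsilon$. It follows that, if
$\epsilon < \|x\|$ and $n' \ge N(\epsilon)$, then

\[ \frac{\|y\| - \epsilon}{\|x\| + \epsilon} < r_{n'} < \frac{\|y\| + \epsilon}{\|x\| - \epsilon}, \]

so that $r_{n'} \to r := \|y\| / \|x\|$. Hence,

\[ \|r_{n'} x_{n'} - r x\| \le |r_{n'}|\,\|x_{n'} - x\| + |r_{n'} - r|\,\|x\| \to 0. \]

By the uniqueness of limits, $y = rx \in C(K)$. □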
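The role of the hypothesis $0 \notin C_1(K)$ in the proof above can be seen in a small planar example, which is our own illustration and not taken from the paper. Assume $K$ is the unit circle centered at $(1,0)$ in $\mathbb{R}^2$: it is compact, but it passes through the origin, so $0 \in C_1(K)$. The sketch below (the function name `cone_sequence` is hypothetical) builds points $y = r x$ with $x \in K$ whose limit $(0,1)$ is not in the generated half-cone, because the scalars $r$ blow up while $\|x\|$ tends to zero.

```python
import math

def cone_sequence(theta):
    """Return (r, x, y) with x on the circle K and y = r * x."""
    # x lies on the unit circle centered at (1, 0), which passes through the origin.
    x = (1.0 - math.cos(theta), math.sin(theta))
    r = 1.0 / math.sin(theta)            # scale so the second coordinate of y is 1
    y = (r * x[0], r * x[1])
    return r, x, y

for theta in (1e-1, 1e-2, 1e-3):
    r, x, y = cone_sequence(theta)
    # r grows without bound and |x| shrinks, while y approaches (0, 1).
    print(f"theta={theta:g}  r={r:.1f}  |x|={math.hypot(*x):.2e}  y=({y[0]:.2e}, {y[1]:.3f})")
```

In this example the half-cone is the open right half-plane together with the origin, so the limit point $(0,1)$ lies outside it. When $0 \notin C_1(K)$, the two-sided bound on $r_{n'}$ in the proof rules out this behavior.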

Notice that the hypothesis that $0 \notin C_1(K)$ is closely related to the
oft-imposed (in finite programming) condition of regularity. A feasible point $x^*$
is said to be regular if the set $K$ is linearly independent, i.e. if no finite
nonzero linear combination of the constraint gradients at $x^*$ can vanish. Our
condition is somewhat stronger in one respect, but much weaker in another.
On the one hand, we consider arbitrary measures (weights) on $K$, not just
finitely supported ones. This is analogous to infinite linear combinations,
hence stronger; on the other hand, we only consider probability measures
(nonnegative weights totalling unity); this is analogous to convex
combinations instead of linear combinations, hence weaker.

In finite programming, if $x^*$ is a regular point, then $x^*$ must satisfy the
constraint qualification. This pleasant property does not hold in infinite
programming; in fact, since the number of linearly independent gradients
cannot exceed the dimension of the space $X$, the notion of regularity is
wholly inappropriate for the case of semi-infinite programming and somewhat
inappropriate in the case of infinite programming. Accordingly, we will search
for other conditions that will imply the constraint qualification.

The simplest situation is the one in which all of the constraints are linear.
If $x^*$ and $z$ are as in Definition 2.2, then the arc $C(t) = x^* + tz$ satisfies

\[ C(0) = x^*; \]
\[ C'(0) = z; \]
\[ g_\alpha(C(t)) \ge 0 \quad \forall\, t \ge 0,\ \forall\, \alpha \in I_0(x^*); \]
\[ h(C(t)) = 0 \quad \forall\, t \ge 0. \]

Moreover, for each $\alpha \in I \sim I_0(x^*)$ (the nonbinding constraints), there exists
$\tau(\alpha) > 0$ such that

\[ g_\alpha(C(t)) \ge 0 \quad \forall\, t \in [0, \tau(\alpha)). \]

If the number of nonbinding constraints is finite, then we can take
$\tau = \inf_\alpha \{\tau(\alpha)\} > 0$ and the constraint qualification is automatically satisfied.
Otherwise, it may be that $\inf_\alpha \{\tau(\alpha)\} = 0$ and the constraint qualification
may not hold. We are therefore content to establish that the constraint
qualification holds for one important family of examples.
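To see how $\inf_\alpha \{\tau(\alpha)\}$ can vanish for the straight-line arc, consider the following sketch. It is our own illustration under stated assumptions, not an example from the paper: the index set is taken to be $(0,1]$, the feasible point is $x^*(\alpha) = \alpha$ (no binding constraints), and the direction is $z(\alpha) = -\sqrt{\alpha}$. Then $x^*(\alpha) + t z(\alpha) \ge 0$ exactly when $t \le \sqrt{\alpha}$, so $\tau(\alpha) = \sqrt{\alpha}$. The helper names `tau` and `line_arc_feasible` are hypothetical.

```python
import math

def tau(alpha):
    """Largest t with x*(alpha) + t*z(alpha) >= 0, for x*(a) = a and z(a) = -sqrt(a)."""
    return math.sqrt(alpha)

def line_arc_feasible(t, alphas):
    """Check g_alpha(x* + t*z) >= 0 at every index in alphas."""
    return all(a - t * math.sqrt(a) >= 0.0 for a in alphas)

alphas = [10.0 ** (-k) for k in range(0, 9)]   # indices accumulating at 0
print(min(tau(a) for a in alphas))             # shrinks as indices nearer 0 are added
print(line_arc_feasible(1e-5, alphas))         # True: every sqrt(alpha) is at least 1e-4
print(line_arc_feasible(1e-3, alphas))         # False: violated at alpha = 1e-8
```

Over the whole index set $(0,1]$ the infimum of $\tau(\alpha)$ is zero, so the straight line leaves the feasible set for every $t > 0$. This only shows that the linear arc can fail; a different arc may still satisfy the constraint qualification.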

Both control theory and statistics abound with constraints of the sort
that a function be bounded by certain values. The following result addresses
the prototypical case; we hope that the method of proof will suffice for a
variety of applications.

Lemma 3.4 Let $X$ denote a real Hilbert space of functions $x : I \to (-\infty,+\infty)$.
Let $g_\alpha$ denote evaluation at $\alpha \in I$. If $X$ is a proper functional Hilbert space,
i.e. if the $g_\alpha$ are continuous, then the collection of inequality constraints

\[ g_\alpha(x) = x(\alpha) \ge 0 \quad \forall\, \alpha \in I \]

satisfies the constraint qualification.


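Proof: Since the $g_\alpha$ are continuous and linear, $\nabla g_\alpha(x)$ exists $\forall\, x \in X$.
Suppose that $x^* \in X$ is a feasible point and that a nonzero $z \in X$ satisfies

\[ \langle z, \nabla g_\alpha(x^*) \rangle = g_\alpha'(x^*)(z) = z(\alpha) \ge 0 \quad \forall\, \alpha \in I_0(x^*), \]

i.e. $\forall\, \alpha$ such that $x^*(\alpha) = 0$.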


For $t > 0$, let

\[ K(t) := \{y \in X : x^*(\alpha) + t\,y(\alpha) \ge 0 \ \ \forall\, \alpha \in I\}. \]

We note that the sets $K(t)$ are nested, for suppose that $t_0 < t_1$. If
$0 \le x^*(\alpha) + t_1 y(\alpha)$ $\forall\, \alpha \in I$, i.e. $y \in K(t_1)$, then

\[ 0 \le \frac{t_0}{t_1}\,x^*(\alpha) + t_0\,y(\alpha) \le x^*(\alpha) + t_0\,y(\alpha) \quad \forall\, \alpha \in I, \]

i.e. $y \in K(t_0)$. Thus, $K(t_1) \subset K(t_0)$.

Suppose that $t_n \downarrow t_0$. Then $\cup_n K(t_n) \subset K(t_0)$; hence, the closure of
$\cup_n K(t_n)$ is contained in the closure of $K(t_0)$, which is just $K(t_0)$. To establish
the converse, let $y_0 \in K(t_0)$ and set $y_n = (t_0/t_n)\,y_0$. Then $y_n \in K(t_n)$ and
$\|y_n - y_0\| \to 0$; hence, $K(t_0)$ is contained in the closure of $\cup_n K(t_n)$.

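Suppose that $t_n \uparrow t_0$ and let $y \in \cap_n K(t_n)$, i.e.

\[ x^*(\alpha) + t_n\,y(\alpha) \ge 0 \quad \forall\, \alpha \in I,\ \forall\, n. \]

Letting $n \to \infty$, we obtain

\[ x^*(\alpha) + t_0\,y(\alpha) \ge 0 \quad \forall\, \alpha \in I, \]

i.e. $y \in K(t_0)$. Together with the nesting $K(t_0) \subset K(t_n)$, this gives
$\cap_n K(t_n) = K(t_0)$.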

It now follows from Proposition 3.22 in Attouch [1] that, if $t_n \to t_0$,
then $K(t_n) \to K(t_0)$ in the sense of Mosco-convergence of closed convex
sets. Furthermore, Proposition 3.34 in Attouch [1] states that the
Mosco-convergence of closed convex sets is equivalent to the convergence of the
projections of an arbitrary point into these sets. Therefore, let $z^*(t)$ denote
the projection of $z$ into $K(t)$. Then, $t_n \to t_0$ entails $\|z^*(t_n) - z^*(t_0)\| \to 0$,
and we conclude that $z^*(t)$ is a continuous arc for $t > 0$. Moreover, since
$x^*(\alpha) = 0$ entails $z(\alpha) \ge 0$, $z$ is contained in the closure of $\cup_{t>0} K(t)$. We
can therefore close the arc by setting $z^*(0) = z$.




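Now let $C(t) := x^* + t\,z^*(t)$. By construction, $C$ is a feasible continuous
arc with $C(0) = x^*$. Moreover,

\[ \lim_{t \downarrow 0} \left\| \tfrac{1}{t}\,[C(t) - C(0)] - z \right\| = \lim_{t \downarrow 0} \left\| \tfrac{1}{t}\,[t\,z^*(t)] - z \right\| = \lim_{t \downarrow 0} \|z^*(t) - z\| = 0, \]

so $C'(0) = z$. This verifies the conditions specified by Definition 2.2. □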
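The construction of the arc $C(t) = x^* + t\,z^*(t)$ can be sketched numerically under a simplifying assumption that departs from the lemma: we replace $X$ by a grid discretization of $(0,1)$ with an $L^2$-type inner product, so that the projection onto $K(t)$ is the pointwise maximum $\max\{z(\alpha), -x^*(\alpha)/t\}$. In a proper functional Hilbert space such as a Sobolev space the projection is not pointwise, so this is only an illustration of the idea. The data $x^*(\alpha) = \alpha$, $z(\alpha) = -\sqrt{\alpha}$ and all names below are our own choices; for them a direct calculation gives $\|z^*(t) - z\| = t^2/\sqrt{30}$ for $0 < t \le 1$.

```python
import math

M = 20000                                   # number of grid cells on (0, 1)
H = 1.0 / M
GRID = [(k + 0.5) * H for k in range(M)]    # cell midpoints
X_STAR = GRID[:]                            # x*(alpha) = alpha (feasible, nonnegative)
Z = [-math.sqrt(a) for a in GRID]           # direction z(alpha) = -sqrt(alpha)

def projection_arc(t):
    """Pointwise projection of z onto K(t) = {y : x* + t*y >= 0} (L2-type geometry)."""
    return [max(zi, -xi / t) for xi, zi in zip(X_STAR, Z)]

def arc_and_error(t):
    """Return C(t) = x* + t*z*(t) and the discrete L2 norm of z*(t) - z."""
    z_t = projection_arc(t)
    arc = [xi + t * yi for xi, yi in zip(X_STAR, z_t)]
    err = math.sqrt(H * sum((a - b) ** 2 for a, b in zip(z_t, Z)))
    return arc, err

for t in (0.4, 0.2, 0.1):
    arc, err = arc_and_error(t)
    # Feasibility up to rounding, then the error next to the closed form t^2/sqrt(30).
    print(t, min(arc) >= -1e-12, err, t * t / math.sqrt(30.0))
```

The projected arc stays feasible for every $t$, and $z^*(t) \to z$ as $t \downarrow 0$, which is what yields $C'(0) = z$ in the proof.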

Remark: It is also possible to give an elementary proof that the arc $z^*(t)$ is
continuous. The use of Mosco-convergence was suggested to us anonymously
by the referee. The equivalence of Mosco-convergence and the convergence
of the projections of an arbitrary point is due to Sonntag [18].


4  An  Example 


We now apply our results to obtain necessity conditions for a well-known
problem from the statistical literature on probability density estimation.
Watson and Leadbetter [24] sought to minimize the mean integrated squared
error of a kernel probability density estimator. Specifically, given
independent and identically distributed random variables $X_1, \ldots, X_n$ with
probability density function $\theta$, they analyzed the optimization problem


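\[ \underset{x_n \in L^2}{\text{minimize}} \quad E \int_{-\infty}^{+\infty} \left[ \frac{1}{n} \sum_{i=1}^{n} x_n(\alpha - X_i) - \theta(\alpha) \right]^2 d\alpha. \]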


It turns out that solutions are typically not everywhere nonnegative, which
results in estimates that are not themselves probability densities. This is a
matter of taste, but if we prefer to estimate densities with densities, then we
must confront the constrained optimization problem


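\[ \text{minimize} \quad E \int_{-\infty}^{+\infty} \left[ \frac{1}{n} \sum_{i=1}^{n} x_n(\alpha - X_i) - \theta(\alpha) \right]^2 d\alpha \]
\[ \text{subject to} \quad x_n(\alpha) \ge 0 \quad \forall\, \alpha \in (-\infty,+\infty), \]
\[ \int_{-\infty}^{+\infty} x_n(\alpha)\,d\alpha = 1. \]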


This problem does not yield to variational methods, making it a natural
candidate for the application of our multiplier theory. We proceed to formulate
it in that context.

Consider the Sobolev space $H^1[\alpha_1,\alpha_2]$, which is defined by endowing the
vector space

\[ \{x : x^{(j)} \in L^2[\alpha_1,\alpha_2] \text{ for } j = 0,1\} \]

with the inner product

\[ \langle x, y \rangle = \sum_{j=0}^{1} \int_{\alpha_1}^{\alpha_2} x^{(j)}(\alpha)\,y^{(j)}(\alpha)\,d\alpha. \]

It should be noted that the derivatives in the definition of $H^1[\alpha_1,\alpha_2]$ are taken
in the sense of distributions. It is well known that $H^1[\alpha_1,\alpha_2]$ is a proper
functional Hilbert space and that each element of $H^1[\alpha_1,\alpha_2]$ is absolutely
continuous. See Appendix I of Tapia and Thompson [20] for a discussion of the
analogous Sobolev space, $H^1(-\infty,+\infty)$. Notice that, if $\theta \in H^1(-\infty,+\infty)$,
then the restriction of $\theta$ to $[\alpha_1,\alpha_2]$ is an element of $H^1[\alpha_1,\alpha_2]$.

We now return to the problem of Watson and Leadbetter, which we
reformulate as problem (WL):

\[ \text{(WL)} \qquad \underset{x_n \in X}{\text{minimize}} \quad f(x_n) = E \int_{-\infty}^{+\infty} \left[ \frac{1}{n} \sum_{i=1}^{n} \bar{x}_n(\alpha - X_i) - \theta(\alpha) \right]^2 d\alpha \]
\[ \text{subject to} \quad g_\alpha(x_n) = x_n(\alpha) \ge 0 \quad \forall\, \alpha \in I, \]
\[ h(x_n) = \int_I x_n(\alpha)\,d\alpha - 1 = 0, \]

where $I = [\alpha_1,\alpha_2]$, $X = H^1[\alpha_1,\alpha_2]$, and $\bar{x}_n$ denotes the extension of $x_n$ to
$(-\infty,+\infty)$ defined by $\bar{x}_n(\alpha) = 0$ if $\alpha \notin I$; and where the expectation is taken
with respect to the independent and identically distributed random variables
$X_i$, $i = 1,\ldots,n$, having probability density function $\theta \in H^1(-\infty,+\infty)$. We
have modified the original problem in two ways. First, we have demanded
some additional smoothness. Second, we have restricted attention to kernels
supported on $[\alpha_1,\alpha_2]$. We proceed to verify that Theorem 2.1 can be applied
to problem (WL).

The point evaluation functionals $g_\alpha \in X^*$ are both linear and
continuous, hence continuously differentiable and also uniformly
continuous. It is also easily checked that $f, h \in C^1(X)$. Furthermore, the
singleton set $\{\nabla h(x)\}$ is obviously compact. We also have


Lemma 4.1 For problem (WL), the set $\nabla A(x) = \{\nabla g_\alpha(x) : \alpha \in I\}$ is
compact.

Proof: We apply Lemma 3.1. The point evaluation functionals $g_\alpha \in X^*$
are continuous, with $g_\alpha(x) = x(\alpha)$. Since $I$ is compact, it remains
only to demonstrate that the index map $\alpha \mapsto g_\alpha$ is continuous.

Consider the optimization problem

\[ \text{minimize} \quad \int_{a_1}^{a_2} [x'(\alpha)]^2\,d\alpha \]
\[ \text{subject to} \quad x(a_1) = b_1,\ \ x(a_2) = b_2. \]

It is a trivial exercise in the calculus of variations to establish that the
minimizer is a straight line with slope

\[ x'(\alpha) = (b_2 - b_1)/(a_2 - a_1). \]

This yields a minimum objective function value of $|b_2 - b_1|^2 / |a_2 - a_1|$. It
follows that any $x \in H^1$ with $x(a_1) = b_1$ and $x(a_2) = b_2$ must satisfy

\[ \|x\|^2 \ge |b_2 - b_1|^2 / |a_2 - a_1|. \qquad (3) \]

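Now suppose that $\alpha_n \to \alpha_0$ as $n \to \infty$. Then (3) allows us to write

\[ \|g_{\alpha_n} - g_{\alpha_0}\| = \sup_{\|x\|=1} |g_{\alpha_n}(x) - g_{\alpha_0}(x)| = \sup_{\|x\|=1} |x(\alpha_n) - x(\alpha_0)| \]
\[ \le \sup_{\|x\|=1} |\alpha_n - \alpha_0|^{1/2}\,\|x\| = |\alpha_n - \alpha_0|^{1/2} \to 0 \quad \text{as } n \to \infty. \quad □ \]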
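Inequality (3), rewritten as $|x(s) - x(t)| \le \|x\|\,|s - t|^{1/2}$, can be checked numerically for a sample function. The sketch below is our own illustration: it assumes the standard $H^1$ norm $\|x\|^2 = \int x^2 + \int (x')^2$ on $[0,1]$, picks the test function $x(\alpha) = \sin(3\alpha)$ arbitrarily, and uses a midpoint rule for the integrals. The function names are hypothetical.

```python
import math

def x(a):
    return math.sin(3.0 * a)

def dx(a):
    return 3.0 * math.cos(3.0 * a)

def h1_norm_sq(a1, a2, m=20000):
    """Midpoint-rule value of the integral of x^2 + (x')^2 over [a1, a2]."""
    h = (a2 - a1) / m
    total = 0.0
    for k in range(m):
        a = a1 + (k + 0.5) * h
        total += (x(a) ** 2 + dx(a) ** 2) * h
    return total

NORM = math.sqrt(h1_norm_sq(0.0, 1.0))

def holder_gap(s, t):
    """Right side minus left side of |x(s) - x(t)| <= ||x|| * |s - t|**0.5."""
    return NORM * math.sqrt(abs(s - t)) - abs(x(s) - x(t))

# Smallest gap over a grid of pairs in [0, 1]; it should never be negative.
worst = min(holder_gap(i / 50.0, j / 50.0) for i in range(51) for j in range(51))
print(NORM, worst)
```

A nonnegative gap for every pair is consistent with the Hölder-type bound that drives the continuity of the index map $\alpha \mapsto g_\alpha$.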

Next, we show that our conditions on $K$ hold.

Lemma 4.2 For problem (WL), let $K = \nabla A(x) \cup \{\nabla h(x)\}$. Then $C_1(K)$ does
not contain the origin of $X = H^1[\alpha_1,\alpha_2]$.

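Proof: We exploit the fact that the gradient is the Riesz representer of the
directional derivative. Let $\eta \in X$; then

\[ g_\alpha'(x)(\eta) = \lim_{t \to 0} \tfrac{1}{t}\,[x(\alpha) + t\,\eta(\alpha) - x(\alpha)] = \eta(\alpha) \]

and

\[ h'(x)(\eta) = \lim_{t \to 0} \tfrac{1}{t}\,[h(x + t\eta) - h(x)] \]
\[ = \lim_{t \to 0} \tfrac{1}{t} \left[ \int_I [x + t\eta](\alpha)\,d\alpha - 1 - \int_I x(\alpha)\,d\alpha + 1 \right] \]
\[ = \int_I \eta(\alpha)\,d\alpha. \]

Hence, $\nabla g_\alpha(x)$ must satisfy

\[ \langle \nabla g_\alpha(x), \eta \rangle = \eta(\alpha) \quad \forall\, \eta \in X \]

and $\nabla h(x)$ must satisfy

\[ \langle \nabla h(x), \eta \rangle = \int_I \eta(\alpha)\,d\alpha \quad \forall\, \eta \in X. \]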

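Now suppose that there exists $\mu \in M_1(K)$ such that $\int_K y\,\mu(dy) = 0$. Let
$\lambda = \mu(\{\nabla h(x)\})$ and let $(1 - \lambda)\nu$ denote the measure on $I$ induced by $\mu$.
Then it must be that, $\forall\, \eta \in X$,

\[ 0 = \langle 0, \eta \rangle \]
\[ = \left\langle \int_K y\,\mu(dy), \eta \right\rangle \]
\[ = \int_K \langle y, \eta \rangle\,\mu(dy) \]
\[ = \int_{\nabla A(x)} \langle y, \eta \rangle\,\mu(dy) + \lambda\,\langle \nabla h(x), \eta \rangle \]
\[ = \int_{\nabla A(x)} \eta(\alpha)\,\mu(dy) + \lambda \int_I \eta(\alpha)\,d\alpha \qquad (y = \nabla g_\alpha(x)) \]
\[ = (1 - \lambda) \int_I \eta(\alpha)\,\nu(d\alpha) + \lambda \int_I \eta(\alpha)\,d\alpha. \qquad (4) \]

But the last expression in (4) is strictly positive if $\eta \in X$ is strictly positive
on $I$; hence, $C_1(K)$ cannot contain the origin of $X$. □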

Remark: If $\nu$ is a finitely supported signed measure, say $\nu = \sum_i u_i\,1(\alpha_i)$,
where $1$ denotes point-mass, then (4) reduces to

\[ 0 = (1 - \lambda) \sum_i u_i\,\eta(\alpha_i) + \lambda \int_I \eta(\alpha)\,d\alpha. \]

If $\lambda = 1$, this equality fails for (say) $\eta(\alpha) = 1$; if $\lambda \ne 1$, this equality fails
for any $\eta$ satisfying $\eta(\alpha_i) = -u_i$ and $\int_I \eta(\alpha)\,d\alpha = 0$. Thus, the condition of
regularity also holds for problem (WL). Notice, however, that the restriction
to finite linear combinations in the definition of linear independence is crucial
to this conclusion. If arbitrary signed measures are allowed, then take $\nu$ to
be the negative uniform measure on $I$ and put $\lambda = 1/(\alpha_2 - \alpha_1 + 1)$. Then
the last expression in (4) is

\[ (1 - \lambda) \int_I \eta(\alpha)\,\nu(d\alpha) + \lambda \int_I \eta(\alpha)\,d\alpha \]
\[ = \left( 1 - \frac{1}{\alpha_2 - \alpha_1 + 1} \right) \frac{-1}{\alpha_2 - \alpha_1} \int_I \eta(\alpha)\,d\alpha + \frac{1}{\alpha_2 - \alpha_1 + 1} \int_I \eta(\alpha)\,d\alpha, \]

which does indeed vanish $\forall\, \eta \in X$. This distinction should not be surprising.
Roughly stated, finitely many values do not determine a function's Lebesgue
integral, but all values together do.
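The cancellation in the remark above is easy to confirm numerically. The following sketch is our own check, with hypothetical function names and a midpoint rule standing in for the Lebesgue integral. It evaluates the last expression in (4) with $\nu$ the negative uniform probability measure on $I = [\alpha_1,\alpha_2]$ and $\lambda = 1/(\alpha_2 - \alpha_1 + 1)$, for a few arbitrary test functions $\eta$.

```python
import math

def integral(eta, a1, a2, m=2000):
    """Midpoint-rule approximation of the integral of eta over [a1, a2]."""
    h = (a2 - a1) / m
    return h * sum(eta(a1 + (k + 0.5) * h) for k in range(m))

def expression_4(eta, a1, a2):
    """(1 - lam) * int eta dnu + lam * int eta dalpha for the negative uniform nu."""
    lam = 1.0 / (a2 - a1 + 1.0)
    int_eta = integral(eta, a1, a2)
    int_eta_dnu = -int_eta / (a2 - a1)      # nu(dalpha) = -dalpha / (a2 - a1)
    return (1.0 - lam) * int_eta_dnu + lam * int_eta

for eta in (lambda a: 1.0, math.exp, math.sin):
    # Each value is zero up to floating-point rounding.
    print(expression_4(eta, -1.0, 2.0))
```

The two terms cancel for every $\eta$ because both are multiples of the same integral $\int_I \eta(\alpha)\,d\alpha$, with coefficients $-1/(\alpha_2 - \alpha_1 + 1)$ and $+1/(\alpha_2 - \alpha_1 + 1)$.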

Finally, the equality constraint in problem (WL) is easily incorporated
into the proof of Lemma 3.4. This provides a means of verifying that any
feasible point for problem (WL) satisfies the constraint qualification.
Theorem 2.1 therefore applies: a necessary condition for $x_n^*$ to be a local solution
of problem (WL) is that the first-order conditions hold.

Let us make some further observations concerning problem (WL). The
objective function is strictly convex and the constraint set is convex. It
follows that any local solution will be the unique global solution. It is well
known that the variational inequality which serves as a necessity condition
when the constraint set is convex serves as a sufficiency condition when the
objective function is also convex. A rather straightforward argument can be
used to show that, in the case of a convex constraint set, condition (i) of
Proposition 2.1 implies the variational inequality necessity condition.
These comments say that, in the case of a convex program
where the constraint qualification holds (as is the case for problem (WL)),
the existence of Lagrange multipliers (Proposition 2.1) is both necessary and
sufficient for $x^*$ to be a global minimizer.

Our theory, the above comments and some straightforward computations
lead us to the following result concerning problem (WL): $x_n^*$ is the unique
global minimizer if and only if there exists a totally finite measure
concentrating on $[\alpha_1,\alpha_2]$, with density function $u^*$, and a real number $\lambda^*$, such
that

(a) $u^*(\alpha) = \frac{2(n-1)}{n}\,[\bar{x}_n^* * \theta * \tilde{\theta}](\alpha) - 2\,[\theta * \tilde{\theta}](\alpha) + \frac{2}{n}\,x_n^*(\alpha) + \lambda^* \quad \forall\, \alpha \in [\alpha_1,\alpha_2]$,

(b) $x_n^*(\alpha) \ge 0 \quad \forall\, \alpha \in [\alpha_1,\alpha_2]$,

(c) $\int_{\alpha_1}^{\alpha_2} x_n^*(\alpha)\,d\alpha = 1$,

(d) $x_n^*(\alpha)\,u^*(\alpha) = 0 \quad \forall\, \alpha \in [\alpha_1,\alpha_2]$,

(e) $u^*(\alpha) \ge 0 \quad \forall\, \alpha \in [\alpha_1,\alpha_2]$.

In condition (a), $\tilde{\theta}(\alpha) := \theta(-\alpha)$, and $*$ denotes convolution.

Since problem (WL) is highly nontrivial, it is not surprising that the
corresponding necessity conditions are somewhat complicated. A more detailed
analysis of these conditions was undertaken by Trosset [21]. Nevertheless, it
is evident from the material presented here that the theory developed in
Sections 2 and 3 can be productively applied to a body of problems admitting
an infinite programming formulation.